Learning on the fly: a font-free approach toward multilingual OCR
Internal identifier: 000558 (Main/Exploration); previous: 000557; next: 000559
Authors: Andrew Kae [United States]; David A. Smith [United States]; Erik Learned-Miller [United States]
Source:
- International journal on document analysis and recognition : (Print) [ 1433-2833 ] ; 2011.
French descriptors
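- Pascal (Inist): Reconnaissance caractère, Reconnaissance optique caractère, Reconnaissance forme, Traitement image, A la volée, Multilinguisme, Structure document, Processus itératif, Grec, Modélisation, Méthode itérative.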
- Wicri :
- topic : Multilinguisme.
English descriptors
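- KwdEn: Character recognition, Document structure, Greek, Image processing, Iterative method, Iterative process, Modeling, Multilingualism, On the fly, Optical character recognition, Pattern recognition.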
Abstract
Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents continue to break modern OCR software such as documents with moderate degradation or unusual fonts. Many approaches rely on pre-computed or stored character models, but these are vulnerable to cases when the font of a particular document was not part of the training set or when there is so much noise in a document that the font model becomes weak. To address these difficult cases, we present a form of iterative contextual modeling that learns character models directly from the document it is trying to recognize. We use these learned models both to segment the characters and to recognize them in an incremental, iterative process. We present results comparable with those of a commercial OCR system on a subset of characters from a difficult test document in both English and Greek.
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000104
- to stream PascalFrancis, to step Curation: 000668
- to stream PascalFrancis, to step Checkpoint: 000112
- to stream Main, to step Merge: 000564
- to stream Main, to step Curation: 000558
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Learning on the fly: a font-free approach toward multilingual OCR</title>
<author><name sortKey="Kae, Andrew" sort="Kae, Andrew" uniqKey="Kae A" first="Andrew" last="Kae">Andrew Kae</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author><name sortKey="Smith, David A" sort="Smith, David A" uniqKey="Smith D" first="David A." last="Smith">David A. Smith</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author><name sortKey="Learned Miller, Erik" sort="Learned Miller, Erik" uniqKey="Learned Miller E" first="Erik" last="Learned-Miller">Erik Learned-Miller</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">12-0083148</idno>
<date when="2011">2011</date>
<idno type="stanalyst">PASCAL 12-0083148 INIST</idno>
<idno type="RBID">Pascal:12-0083148</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000104</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000668</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000112</idno>
<idno type="wicri:doubleKey">1433-2833:2011:Kae A:learning:on:the</idno>
<idno type="wicri:Area/Main/Merge">000564</idno>
<idno type="wicri:Area/Main/Curation">000558</idno>
<idno type="wicri:Area/Main/Exploration">000558</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Learning on the fly: a font-free approach toward multilingual OCR</title>
<author><name sortKey="Kae, Andrew" sort="Kae, Andrew" uniqKey="Kae A" first="Andrew" last="Kae">Andrew Kae</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author><name sortKey="Smith, David A" sort="Smith, David A" uniqKey="Smith D" first="David A." last="Smith">David A. Smith</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author><name sortKey="Learned Miller, Erik" sort="Learned Miller, Erik" uniqKey="Learned Miller E" first="Erik" last="Learned-Miller">Erik Learned-Miller</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Document structure</term>
<term>Greek</term>
<term>Image processing</term>
<term>Iterative method</term>
<term>Iterative process</term>
<term>Modeling</term>
<term>Multilingualism</term>
<term>On the fly</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance forme</term>
<term>Traitement image</term>
<term>A la volée</term>
<term>Multilinguisme</term>
<term>Structure document</term>
<term>Processus itératif</term>
<term>Grec</term>
<term>Modélisation</term>
<term>Méthode itérative</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Multilinguisme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents continue to break modern OCR software such as documents with moderate degradation or unusual fonts. Many approaches rely on pre-computed or stored character models, but these are vulnerable to cases when the font of a particular document was not part of the training set or when there is so much noise in a document that the font model becomes weak. To address these difficult cases, we present a form of iterative contextual modeling that learns character models directly from the document it is trying to recognize. We use these learned models both to segment the characters and to recognize them in an incremental, iterative process. We present results comparable with those of a commercial OCR system on a subset of characters from a difficult test document in both English and Greek.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Massachusetts</li>
</region>
<settlement><li>Amherst (Massachusetts)</li>
</settlement>
<orgName><li>Université du Massachusetts à Amherst</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Massachusetts"><name sortKey="Kae, Andrew" sort="Kae, Andrew" uniqKey="Kae A" first="Andrew" last="Kae">Andrew Kae</name>
</region>
<name sortKey="Learned Miller, Erik" sort="Learned Miller, Erik" uniqKey="Learned Miller E" first="Erik" last="Learned-Miller">Erik Learned-Miller</name>
<name sortKey="Smith, David A" sort="Smith, David A" uniqKey="Smith D" first="David A." last="Smith">David A. Smith</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000558 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000558 | SxmlIndent | more
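Outside Dilib, a record of this shape can probably also be read with a general-purpose XML parser. The following is only a minimal sketch in Python, not part of the Dilib toolchain. It assumes the record text is already available as a string (for example, captured from the HfdSelect output). Because the record uses the `wicri:` and `inist:` prefixes without declaring them, a standard parser would reject it as is; the sketch wraps the record in a hypothetical `<wrapper>` element that binds those prefixes to placeholder URIs. The names `PLACEHOLDER_NS`, `parse_record`, `summarize`, and `SAMPLE` are illustrative assumptions, and `SAMPLE` is a shortened excerpt of the record above.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (assumptions): the record never declares these prefixes,
# so they are bound here only to let the parser accept the markup.
PLACEHOLDER_NS = {
    "wicri": "urn:example:wicri",
    "inist": "urn:example:inist",
}

# Shortened excerpt of the record shown above, used only for illustration.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Learning on the fly: a font-free approach toward multilingual OCR</title>
<author><name sortKey="Kae, Andrew">Andrew Kae</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Department of Computer Science</s1>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Greek</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(record_xml: str) -> ET.Element:
    """Parse one <record> element, binding the undeclared prefixes first."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    wrapped = f"<wrapper {decls}>{record_xml}</wrapper>"
    return ET.fromstring(wrapped).find("record")


def summarize(record: ET.Element) -> dict:
    """Collect the title, author names, and English keywords of a record."""
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [n.text for n in title_stmt.findall("./author/name")],
        "keywords": [
            t.text for t in record.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }


info = summarize(parse_record(SAMPLE))
print(info["title"])
print(info["authors"])
print(info["keywords"])
```

The wrapper approach leaves the record text itself unchanged; if the real prefixes have official namespace URIs, those should replace the placeholders.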
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:12-0083148 |texte= Learning on the fly: a font-free approach toward multilingual OCR }}
This area was generated with Dilib version V0.6.32.